Using DTW to compare sounds

With librosa, for instance, you can easily extract the MFCCs (mel-frequency cepstral coefficients) of a sound.

Compute the MFCCs of two sounds



In [1]:

    
import librosa

y1, sr1 = librosa.load('../../Downloads/tmp/sounds/10.wav')
y2, sr2 = librosa.load('../../Downloads/tmp/sounds/78.wav')



In [2]:

    
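%pylab inline
import librosa.display  # some librosa versions do not load the display submodule automatically

# MFCCs of the first sound, as an array of shape (n_mfcc, n_frames)
subplot(1, 2, 1)
mfcc1 = librosa.feature.mfcc(y=y1, sr=sr1)
librosa.display.specshow(mfcc1)

# MFCCs of the second sound
subplot(1, 2, 2)
mfcc2 = librosa.feature.mfcc(y=y2, sr=sr2)
librosa.display.specshow(mfcc2)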

Populating the interactive namespace from numpy and matplotlib

    Out[2]:

<matplotlib.image.AxesImage at 0x11276bd10>

Compare them using DTW



In [3]:

    
from dtw import dtw



In [4]:

    
# Transpose so that each row is one MFCC frame; DTW aligns the frames over time
dist, cost, path = dtw(mfcc1.T, mfcc2.T)
print('Normalized distance between the two sounds: {}'.format(dist))

Normalized distance between the two sounds: 192.489808008
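
To make the number above less opaque, here is a minimal NumPy sketch of how a normalized DTW distance can be computed. It is not the code of the `dtw` package: the function name `dtw_distance`, the L1 frame-to-frame cost, and the normalization by the sum of the two sequence lengths are all assumptions chosen for illustration.

```python
import numpy as np

def dtw_distance(x, y):
    """Illustrative DTW between two sequences of feature vectors.

    x has shape (n, d) and y has shape (m, d). The frame cost is the L1
    norm and the result is divided by n + m (both are assumptions).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    # Pairwise L1 cost between every frame of x and every frame of y.
    cost = np.abs(x[:, None, :] - y[None, :, :]).sum(axis=2)
    # acc[i, j] = cheapest cumulative cost of aligning x[:i] with y[:j].
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j - 1],  # advance in both sequences
                acc[i - 1, j],      # advance in x only
                acc[i, j - 1],      # advance in y only
            )
    # Drop the padding row and column before returning the matrix.
    return acc[n, m] / (n + m), acc[1:, 1:]

# Toy example: b is a time-stretched copy of a, so the alignment costs nothing.
a = np.array([[0.0], [1.0], [2.0]])
b = np.array([[0.0], [1.0], [1.0], [2.0]])
dist_ab, acc_ab = dtw_distance(a, b)
```

With MFCCs, `x` and `y` would be the transposed matrices, one row per frame. The toy pair shows the point of DTW: a sound and a slowed-down copy of it can still be at distance zero.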



In [5]:

    
# Cost matrix in grayscale, with the warping path drawn in white
imshow(cost.T, origin='lower', cmap=cm.gray, interpolation='nearest')
plot(path[0], path[1], 'w')
xlim((-0.5, cost.shape[0]-0.5))
ylim((-0.5, cost.shape[1]-0.5))

    Out[5]:

(-0.5, 37.5)
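
The white line in the plot above is the warping path: the sequence of frame pairs that DTW matched. As a rough sketch of how such a path can be recovered, the snippet below backtracks through an accumulated-cost matrix from the last cell to the first. The function name `warping_path` and the small hand-written matrix are hypothetical, and the `dtw` package may implement this step differently.

```python
import numpy as np

def warping_path(acc):
    """Backtrack through an accumulated-cost matrix of shape (n, m).

    Returns two index arrays, one per sequence, running from (0, 0)
    to (n - 1, m - 1).
    """
    i, j = acc.shape[0] - 1, acc.shape[1] - 1
    p, q = [i], [j]
    while i > 0 or j > 0:
        if i == 0:
            j -= 1  # first row: can only move left
        elif j == 0:
            i -= 1  # first column: can only move up
        else:
            # Move to the cheapest of the three predecessor cells.
            step = np.argmin([acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]])
            if step == 0:
                i, j = i - 1, j - 1
            elif step == 1:
                i -= 1
            else:
                j -= 1
        p.append(i)
        q.append(j)
    return np.array(p[::-1]), np.array(q[::-1])

# Hypothetical accumulated costs for a 3-frame versus 4-frame toy alignment.
acc = np.array([
    [0.0, 1.0, 2.0, 4.0],
    [1.0, 0.0, 0.0, 1.0],
    [3.0, 1.0, 1.0, 0.0],
])
p, q = warping_path(acc)
```

Here frame 1 of the first sequence is paired with frames 1 and 2 of the second one, which is how a stretch in time shows up as a flat segment in the path.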


